Goto

Collaborating Authors

 Boulder County


Accurate and Reliable Uncertainty Estimates for Deterministic Predictions Extensions to Under and Overpredictions

Bandy, Rileigh, Camporeale, Enrico, Hu, Andong, Berger, Thomas, Morrison, Rebecca

arXiv.org Machine Learning

Computational models support high-stakes decisions across engineering and science, and practitioners increasingly seek probabilistic predictions to quantify uncertainty in such models. Existing approaches generate predictions either by sampling input parameter distributions or by augmenting deterministic outputs with uncertainty representations, including distribution-free and distributional methods. However, sampling-based methods are often computationally prohibitive for real-time applications, and many existing uncertainty representations either ignore input dependence or rely on restrictive Gaussian assumptions that fail to capture asymmetry and heavy-tailed behavior. Therefore, we extend the ACCurate and Reliable Uncertainty Estimate (ACCRUE) framework to learn input-dependent, non-Gaussian uncertainty distributions, specifically two-piece Gaussian and asymmetric Laplace forms, using a neural network trained with a loss function that balances predictive accuracy and reliability. Through synthetic and real-world experiments, we show that the proposed approach captures an input-dependent uncertainty structure and improves probabilistic forecasts relative to existing methods, while maintaining flexibility to model skewed and non-Gaussian errors.


Federated Causal Discovery Across Heterogeneous Datasets under Latent Confounding

Hahn, Maximilian, Zajak, Alina, Heider, Dominik, Ribeiro, Adèle Helena

arXiv.org Machine Learning

Causal discovery across multiple datasets is often constrained by data privacy regulations and cross-site heterogeneity, limiting the use of conventional methods that require a single, centralized dataset. To address these challenges, we introduce fedCI, a federated conditional independence test that rigorously handles heterogeneous datasets with non-identical sets of variables, site-specific effects, and mixed variable types, including continuous, ordinal, binary, and categorical variables. At its core, fedCI uses a federated Iteratively Reweighted Least Squares (IRLS) procedure to estimate the parameters of generalized linear models underlying likelihood-ratio tests for conditional independence. Building on this, we develop fedCI-IOD, a federated extension of the Integration of Overlapping Datasets (IOD) algorithm, that replaces its meta-analysis strategy and enables, for the fist time, federated causal discovery under latent confounding across distributed and heterogeneous datasets. By aggregating evidence federatively, fedCI-IOD not only preserves privacy but also substantially enhances statistical power, achieving performance comparable to fully pooled analyses and mitigating artifacts from low local sample sizes. Our tools are publicly available as the fedCI Python package, a privacy-preserving R implementation of IOD, and a web application for the fedCI-IOD pipeline, providing versatile, user-friendly solutions for federated conditional independence testing and causal discovery.